Fair Algorithms for Infinite and Contextual Bandits
Authors
Abstract
Motivated by concerns that automated decision-making procedures can unintentionally lead to discriminatory behavior, we study a technical definition of fairness modeled after John Rawls’ notion of “fair equality of opportunity”. In the context of a simple model of online decision making, we give an algorithm that satisfies this fairness constraint, while still being able to learn at a rate that is comparable to (but necessarily worse than) that of the best algorithms absent a fairness constraint. We prove a regret bound for fair algorithms in the linear contextual bandit framework that is a significant improvement over our technical companion paper [16], which gives black-box reductions in a more general setting. We analyze our algorithms both theoretically and experimentally. Finally, we introduce the notion of a “discrimination index”, and show that standard algorithms for our problem exhibit structured discriminatory behavior, whereas the “fair” algorithms we develop do not.
Author affiliations: Department of Computer and Information Sciences, University of Pennsylvania; Department of Statistics, The Wharton School, University of Pennsylvania.
arXiv:1610.09559v2 [cs.LG], 1 Nov 2016
Similar resources
Fair Algorithms for Infinite Contextual Bandits
We study fairness in infinite linear bandit problems. Starting from the notion of meritocratic fairness introduced in Joseph et al. [9], we expand their notion of fairness for infinite action spaces and provide an algorithm that obtains a sublinear but instance-dependent regret guarantee. We then show that this instance dependence is a necessary cost of our fairness definition with a matching l...
Fairness in Learning: Classic and Contextual Bandits
We introduce the study of fairness in multi-armed bandit problems. Our fairness definition demands that, given a pool of applicants, a worse applicant is never favored over a better one, despite a learning algorithm’s uncertainty over the true payoffs. In the classic stochastic bandits problem we provide a provably fair algorithm based on “chained” confidence intervals, and prove a cumulative r...
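The “chained” confidence intervals mentioned in this abstract can be illustrated with a small sketch. It assumes that each arm has a confidence interval `(lower, upper)` around its estimated payoff, that the chain starts from the arm with the highest upper bound, and that an arm joins the chain when its interval overlaps an interval already in it. The function names `chained_set` and `pick_fairly`, and the choice to sample uniformly among chained arms, are illustrative assumptions and are not taken from the truncated abstract.

```python
import random
from typing import List, Optional, Tuple


def chained_set(intervals: List[Tuple[float, float]]) -> List[int]:
    """Indices of arms linked to the top arm through overlapping intervals.

    Hypothetical helper: `intervals[i]` is (lower, upper) for arm i.
    """
    if not intervals:
        return []
    # Start from the arm with the largest upper confidence bound.
    top = max(range(len(intervals)), key=lambda i: intervals[i][1])
    chain = {top}
    lowest = intervals[top][0]  # lowest lower bound reached by the chain
    # Visit the other arms in order of decreasing upper bound.
    others = sorted(
        (i for i in range(len(intervals)) if i != top),
        key=lambda i: intervals[i][1],
        reverse=True,
    )
    for i in others:
        lower, upper = intervals[i]
        if upper < lowest:
            # No remaining arm can overlap the chain, so stop.
            break
        chain.add(i)
        lowest = min(lowest, lower)
    return sorted(chain)


def pick_fairly(
    intervals: List[Tuple[float, float]],
    rng: Optional[random.Random] = None,
) -> int:
    """Sample uniformly among chained arms (an assumed selection rule)."""
    rng = rng or random.Random()
    return rng.choice(chained_set(intervals))
```

Under these assumptions, arms that cannot yet be statistically separated from the apparent best arm are treated identically, which is one way to avoid favoring a worse applicant over a better one while payoffs are still uncertain.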
Better Fair Algorithms for Contextual Bandits
We study fairness in the linear bandit setting. Starting from the notion of meritocratic fairness introduced in Joseph et al. [11], we introduce a sufficiently more general model in which meritocratic fairness can be imposed and satisfied. We then perform a more fine-grained analysis which achieves better performance guarantees in this more general model. Our work therefore studies fairness for...
CBRAP: Contextual Bandits with RAndom Projection
Contextual bandits with linear payoffs, which are also known as linear bandits, provide a powerful alternative for solving practical problems of sequential decisions, e.g., online advertisements. In the era of big data, contextual data usually tend to be high-dimensional, which leads to new challenges for traditional linear bandits mostly designed for the setting of low-dimensional contextual d...
Exponentiated Gradient LINUCB for Contextual Multi-Armed Bandits
We present Exponentiated Gradient LINUCB, an algorithm for contextual multi-armed bandits. This algorithm uses Exponentiated Gradient to find the optimal exploration of the LINUCB. Within a deliberately designed offline simulation framework we conduct evaluations with real online event log data. The experimental results demonstrate that our algorithm outperforms surveyed algorithms.
Publication date: 2016